Statistical Learning and Universal Grammar: Modeling Word Segmentation
Abstract
This paper describes a computational model of word segmentation and presents simulation results on realistic acquisition. In particular, we explore the capacity and limitations of statistical learning mechanisms that have recently gained prominence in cognitive psychology and linguistics.
Similar resources
Unsupervised NLP and Human Language Acquisition: Making Connections to Make Progress
Natural language processing and cognitive science are two fields in which unsupervised language learning is an important area of research. Yet there is often little crosstalk between the two fields. In this talk, I will argue that considering the problem of unsupervised language learning from a cognitive perspective can lead to useful insights for the NLP researcher, while also showing how tool...
Towards a Universal Grammar for Natural Language Processing
Universal Dependencies is a recent initiative to develop crosslinguistically consistent treebank annotation for many languages, with the goal of facilitating multilingual parser development, cross-lingual learning, and parsing research from a language typology perspective. In this paper, I outline the motivation behind the initiative and explain how the basic design principles follow from these...
An Efficient Japanese Parsing Algorithm for Computer-Assisted Language Learning
Instructional grammar is often used in Computer-assisted Language Learning (CALL) and the grammatical error detection is an important feature. However, it is not an easy task in Japanese language. There is no delimiter separating consecutive words in Japanese sentences. Word segmentation is a process in which proper word boundaries are identified. Before syntactic parsing of a Japanese sentence...
Online Adaptor Grammars with Hybrid Inference
Adaptor grammars are a flexible, powerful formalism for defining nonparametric, unsupervised models of grammar productions. This flexibility comes at the cost of expensive inference. We address the difficulty of inference through an online algorithm which uses a hybrid of Markov chain Monte Carlo and variational inference. We show that this inference strategy improves scalability without sacrif...
The Effect of Sonority on Word Segmentation: Evidence for the Use of a Phonological Universal
It has been well documented how language-specific cues may be used for word segmentation. Here, we investigate what role a language-independent phonological universal, the sonority sequencing principle (SSP), may also play. Participants were presented with an unsegmented speech stream with non-English word onsets that juxtaposed adherence to the SSP with transitional probabilities. Participants...